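Ensembling is a straightforward, remarkably effective method for improving the accuracy, calibration, and robustness of models on classification tasks; yet the reasons underlying its success remain an active area of research. We build on the extension of the bias-variance decomposition by Pfau (2013) to gain crucial insights into the behavior of ensembles of classifiers. Introducing a dual reparameterization of the bias-variance tradeoff, we first derive generalized laws of total expectation and variance for the nonsymmetric losses typical of classification tasks. Comparing conditional and bootstrap bias/variance estimates, we then show that conditional estimates necessarily incur an irreducible error. Next, we show that ensembling in dual space reduces the variance and leaves the bias unchanged, whereas standard ensembling can affect the bias arbitrarily. Empirically, standard ensembling reduces the bias, leading us to hypothesize that ensembles of classifiers may perform well in part because of this unexpected reduction. We conclude with an empirical analysis of recent deep learning methods that ensemble over hyperparameters, which reveals that these techniques indeed favor bias reduction. This suggests that, contrary to classical wisdom, targeting bias reduction may be a promising direction for classifier ensembles.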
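The contrast between standard ensembling and ensembling in dual space can be illustrated with a minimal sketch. It assumes a cross-entropy (KL) loss, for which averaging in dual space is taken here to mean averaging log-probabilities and renormalizing; the abstract does not state this, and the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def standard_ensemble(probs):
    # Arithmetic mean of member probabilities; probs has shape (members, classes).
    return probs.mean(axis=0)

def dual_ensemble(probs, eps=1e-12):
    # Assumption: "dual space" = log-probability space (KL loss).
    # Average the log-probabilities, then renormalize (normalized geometric mean).
    mean_log = np.log(probs + eps).mean(axis=0)
    return softmax(mean_log)

rng = np.random.default_rng(0)
member_probs = softmax(rng.normal(size=(5, 3)))  # 5 toy members, 3 classes
p_std = standard_ensemble(member_probs)
p_dual = dual_ensemble(member_probs)
```

Both functions return a valid probability vector; they coincide when all members agree and generally differ otherwise.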
translated by 谷歌翻译
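Stochastic gradient descent (SGD) is a pillar of modern machine learning, serving as the go-to optimization algorithm for a diverse array of problems. While the empirical success of SGD is often attributed to its computational efficiency and favorable generalization behavior, neither effect is well understood, and disentangling them remains an open problem. Even in the simple setting of convex quadratic problems, worst-case analyses give an asymptotic convergence rate for SGD that is no better than that of full-batch gradient descent (GD), and the purported implicit regularization effects of SGD lack a precise explanation. In this work, we study the dynamics of multi-pass SGD on high-dimensional convex quadratics and establish an asymptotic equivalence to a stochastic differential equation, which we call homogenized stochastic gradient descent (HSGD), whose solutions we characterize explicitly in terms of a Volterra integral equation. These results yield precise formulas for the learning and risk trajectories, which reveal a mechanism of implicit conditioning that explains the efficiency of SGD relative to GD. We also prove that the noise from SGD negatively impacts generalization performance, ruling out the possibility of any type of implicit regularization in this context. Finally, we show how to adapt the HSGD formalism to include streaming SGD, which allows us to produce an exact prediction for the excess risk of multi-pass SGD relative to that of streaming SGD (bootstrap risk).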
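To make the setting concrete, the following is a small sketch of full-batch GD and multi-pass SGD on a least-squares objective, one simple instance of a convex quadratic. The problem sizes, step sizes, and noise level are illustrative assumptions, not values from the paper, and the sketch does not implement HSGD or the Volterra equation.

```python
import numpy as np

def risk(A, b, x):
    # Empirical quadratic risk: 0.5 * mean squared residual.
    r = A @ x - b
    return 0.5 * np.mean(r ** 2)

def gd(A, b, lr, steps):
    # Full-batch gradient descent; each step uses all n samples once.
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(steps):
        x -= lr * A.T @ (A @ x - b) / n
    return x

def multipass_sgd(A, b, lr, epochs, rng):
    # Single-sample SGD; each epoch is one shuffled pass over the same data.
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            x -= lr * (A[i] @ x - b[i]) * A[i]
    return x

rng = np.random.default_rng(0)
n, d = 200, 20                                # illustrative sizes
A = rng.normal(size=(n, d)) / np.sqrt(d)      # rows have squared norm close to 1
x_true = rng.normal(size=d)
b = A @ x_true + 0.1 * rng.normal(size=n)

x_gd = gd(A, b, lr=1.0, steps=50)
x_sgd = multipass_sgd(A, b, lr=0.2, epochs=50, rng=rng)
risk_init = risk(A, b, np.zeros(d))
risk_gd, risk_sgd = risk(A, b, x_gd), risk(A, b, x_sgd)
```

One GD step and one SGD epoch each touch every sample once, so `steps` and `epochs` give a rough equal-compute comparison; the resulting risks depend on the step sizes chosen here.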
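A significant obstacle in the development of robust machine learning models is covariate shift, a form of distribution shift that occurs when the input distributions of the training and test sets differ while the conditional label distributions remain the same. Despite the prevalence of covariate shift in real-world applications, a theoretical understanding in the context of modern machine learning has remained lacking. In this work, we examine the exact high-dimensional asymptotics of random feature regression under covariate shift and present a precise characterization of the limiting test error, bias, and variance in this setting. Our results motivate a natural partial order over covariate shifts that provides a sufficient condition for determining when the shift will harm (or even help) test performance. We find that overparameterized models exhibit enhanced robustness to covariate shift, providing one of the first theoretical explanations for this intriguing phenomenon. Additionally, our analysis reveals an exact linear relationship between in-distribution and out-of-distribution generalization performance, offering an explanation for this surprising recent empirical observation.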
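One of the distinguishing characteristics of modern deep learning systems is that they typically employ neural network architectures with enormous numbers of parameters, often in the millions. While this paradigm has inspired significant research on the properties of large networks, relatively little work has been devoted to the fact that these networks are often used to model large, complex datasets, which may themselves contain millions or even billions of constraints. In this work, we focus on this high-dimensional regime, in which both the dataset size and the number of features tend to infinity. We analyze the performance of random feature regression with features $F = f(WX + B)$ for a random weight matrix $W$ and a random bias vector $B$, obtaining exact formulas for the asymptotic training and test errors for data generated by a linear teacher model. The role of the bias can be understood as parameterizing a distribution over activation functions, and our analysis directly generalizes to such distributions, even those not expressible with a traditional additive bias. Intriguingly, we find that a mixture of nonlinearities can improve both the training and test errors over the best single nonlinearity, suggesting that mixtures of nonlinearities may be useful for approximate kernel methods or neural network architecture design.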
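A finite-size version of this setup can be sketched as follows. The dimensions, the `tanh` activation, the Gaussian scalings, and the ridge penalty are all illustrative assumptions; the abstract's results concern the asymptotic limit, which this toy example does not compute.

```python
import numpy as np

def random_features(X, W, B, f=np.tanh):
    # F = f(W X + B): X is (d, n) with samples as columns, W is (p, d), B is (p,).
    return f(W @ X + B[:, None])

def ridge_fit(F, y, reg):
    # Ridge regression on the features: minimizes ||a F - y||^2 + reg * ||a||^2.
    p = F.shape[0]
    return np.linalg.solve(F @ F.T + reg * np.eye(p), F @ y)

rng = np.random.default_rng(0)
d, n_train, n_test, p = 30, 200, 200, 100     # illustrative sizes
beta = rng.normal(size=d) / np.sqrt(d)        # linear teacher: y = beta . x
X_train = rng.normal(size=(d, n_train))
X_test = rng.normal(size=(d, n_test))
y_train, y_test = beta @ X_train, beta @ X_test

W = rng.normal(size=(p, d)) / np.sqrt(d)      # random weight matrix
B = rng.normal(size=p)                        # random bias vector
F_train = random_features(X_train, W, B)
F_test = random_features(X_test, W, B)

a = ridge_fit(F_train, y_train, reg=1e-2)
train_err = np.mean((a @ F_train - y_train) ** 2)
test_err = np.mean((a @ F_test - y_test) ** 2)
```

Only the readout weights `a` are trained; `W` and `B` stay fixed at their random draws, which is what makes this random feature regression rather than a trained network.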
Extracting complex structures from grid-based data is a common key step in automated medical image analysis. The conventional solution to recovering tree-structured geometries typically involves computing the minimal cost path through intermediate representations derived from segmentation masks. However, this methodology has significant limitations in the context of projective imaging of tree-structured 3D anatomical data such as coronary arteries, since there are often overlapping branches in the 2D projection. In this work, we propose a novel approach to predicting tree connectivity structure which reformulates the task as an optimization problem over individual steps of a recursive process. We design and train a two-stage model which leverages the UNet and Transformer architectures and introduces an image-based prompting technique. Our proposed method achieves compelling results on a pair of synthetic datasets, and outperforms a shortest-path baseline.
Curriculum learning and self-paced learning are training strategies that gradually feed samples from easy to more complex. They have attracted increasing attention due to their excellent performance in robotic vision. Most recent works focus on designing curricula based on the difficulty levels of input samples or on smoothing the feature maps. However, smoothing labels to control the learning utility in a curriculum manner is still unexplored. In this work, we design a paced curriculum by label smoothing (P-CBLS), using paced learning with uniform label smoothing (ULS) for classification tasks and fusing uniform and spatially varying label smoothing (SVLS) for semantic segmentation tasks in a curriculum manner. In ULS and SVLS, a larger smoothing factor enforces a heavier smoothing penalty on the true label and limits the information that can be learned. We therefore design the curriculum by label smoothing (CBLS): we set a larger smoothing value at the beginning of training and gradually decrease it to zero, so that the model's learning utility goes from lower to higher. We also design a confidence-aware pacing function and combine it with CBLS to investigate the benefits of various curricula. The proposed techniques are validated on four robotic surgery datasets covering multi-class classification, multi-label classification, captioning, and segmentation tasks. We also investigate the robustness of our method by corrupting the validation data at different severity levels. Our extensive analysis shows that the proposed method improves prediction accuracy and robustness.
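The core CBLS idea, a smoothing factor that starts large and decays to zero, can be sketched with standard uniform label smoothing. The linear schedule, the starting value of 0.3, and the function names are hypothetical choices for illustration; the abstract does not specify the decay rule, and the confidence-aware pacing function and SVLS are not shown.

```python
import numpy as np

def uniform_label_smoothing(one_hot, alpha):
    # Standard ULS: mix the one-hot target with a uniform distribution over k classes.
    k = one_hot.shape[-1]
    return (1.0 - alpha) * one_hot + alpha / k

def smoothing_schedule(epoch, total_epochs, alpha_start=0.3):
    # Hypothetical linear decay from alpha_start (first epoch) to 0 (last epoch).
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    return alpha_start * (1.0 - frac)

labels = np.eye(4)[[0, 2]]        # two toy samples, 4 classes
total_epochs = 5
alphas = [smoothing_schedule(e, total_epochs) for e in range(total_epochs)]
targets_per_epoch = [uniform_label_smoothing(labels, a) for a in alphas]
```

Early epochs train against heavily smoothed targets, and by the final epoch the targets are the original one-hot labels.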
Temporal reasoning is the task of predicting temporal relations of event pairs with corresponding contexts. While some temporal reasoning models perform reasonably well on in-domain benchmarks, we have little idea of these systems' generalizability because of the limitations of existing datasets. In this work, we introduce a novel task named TODAY that bridges this gap with temporal differential analysis, which, as the name suggests, evaluates whether systems can correctly understand the effect of incremental changes. Specifically, TODAY makes slight context changes for given event pairs, and systems need to tell how this subtle contextual change will affect temporal relation distributions. To facilitate learning, TODAY also annotates human explanations. We show that existing models, including GPT-3, drop to random guessing on TODAY, suggesting that they rely heavily on spurious information rather than proper reasoning for temporal predictions. On the other hand, we show that TODAY's supervision style and explanation annotations can be used in joint learning, encouraging models to use more appropriate signals during training and improving performance across several benchmarks. TODAY can also be used to train models to solicit incidental supervision from noisy sources such as GPT-3, moving further towards generic temporal reasoning systems.
State-of-the-art 3D semantic segmentation models are trained on off-the-shelf public benchmarks, but they often face a major challenge when these well-trained models are deployed to a new domain. In this paper, we propose an Active-and-Adaptive Segmentation (ADAS) baseline to enhance the weak cross-domain generalization ability of a well-trained 3D segmentation model and to bridge the point distribution gap between domains. Specifically, before the cross-domain adaptation stage begins, ADAS performs an active sampling operation to select a maximally informative subset from both the source and target domains for effective adaptation, reducing the adaptation difficulty in 3D scenarios. Benefiting from the rise of multi-modal 2D-3D datasets, ADAS utilizes a cross-modal attention-based feature fusion module that extracts a representative pair of image features and point features to achieve bi-directional image-point feature interaction for better, safer adaptation. Experimentally, ADAS is verified to be effective in many cross-domain settings, including: 1) Unsupervised Domain Adaptation (UDA), in which all samples from the target domain are unlabeled; 2) Unsupervised Few-shot Domain Adaptation (UFDA), in which only a few unlabeled samples are available in the target domain; and 3) Active Domain Adaptation (ADA), in which the target samples selected by ADAS are manually annotated. The results demonstrate that ADAS achieves a significant accuracy gain and can easily be coupled with self-training methods or off-the-shelf UDA works.
As language models (LMs) scale, they develop many novel behaviors, good and bad, exacerbating the need to evaluate how they behave. Prior work creates evaluations with crowdwork (which is time-consuming and expensive) or existing data sources (which are not always available). Here, we automatically generate evaluations with LMs. We explore approaches with varying amounts of human effort, from instructing LMs to write yes/no questions to making complex Winogender schemas with multiple stages of LM-based generation and filtering. Crowdworkers rate the examples as highly relevant and agree with 90-100% of labels, sometimes more so than corresponding human-written datasets. We generate 154 datasets and discover new cases of inverse scaling where LMs get worse with size. Larger LMs repeat back a dialog user's preferred answer ("sycophancy") and express greater desire to pursue concerning goals like resource acquisition and goal preservation. We also find some of the first examples of inverse scaling in RL from Human Feedback (RLHF), where more RLHF makes LMs worse. For example, RLHF makes LMs express stronger political views (on gun rights and immigration) and a greater desire to avoid shut down. Overall, LM-written evaluations are high-quality and let us quickly discover many novel LM behaviors.
In this paper, we discuss an imitation-learning-based method for reducing the calibration error of a mixed reality system consisting of a vision sensor and a projector. Unlike a head-mounted display, this setup makes augmented information available to a human subject via the projection of a scene into the real world. Inherently, the camera and projector need to be calibrated as a stereo setup to project accurate information in 3D space. Previous calibration processes require multiple recording and parameter-tuning steps to achieve the desired calibration, which is usually a time-consuming process. To avoid such tedious calibration, we train a CNN model to iteratively correct the extrinsic offset given a QR code and a projected pattern. We discuss the overall system setup, data collection for training, and results of the auto-correction model.